Comparison of Four Data Mining Algorithms for Predicting Colorectal Cancer Risk

Authors

  • Hadi Kazemi-Arpanahi Dept. of Health Information Technology, Abadan Faculty of Medical Sciences, Abadan, Iran.
  • Mostafa Shanbehzadeh Dept. of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran.
  • Raoof Nopour Dept. of Health Information Technology, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran.
Abstract:

Background and Objective: Colorectal cancer (CRC) is one of the most prevalent malignancies in the world. Early detection of CRC is not only a simple process but also the key to its treatment. Given that data mining algorithms could be useful in cancer prognosis, diagnosis, and treatment, this study measures the performance of several data mining classification algorithms in predicting CRC and providing an early warning to high-risk groups.

Materials and Methods: This study included 468 subjects (194 CRC patients and 274 non-CRC cases). We used the CRC dataset from the Imam Hospital, Sari, Iran. The Chi-square feature selection method was used to analyze the risk factors. Four popular data mining algorithms were then compared on their performance in predicting CRC, and the best algorithm was identified.

Results: The best outcome was obtained by J-48 (F-Measure = 0.826, ROC = 0.881, precision = 0.826, and sensitivity = 0.827). Bayesian Net was the second-best performer (F-Measure = 0.718, ROC = 0.784, precision = 0.719, and sensitivity = 0.722). Random-Forest was third (F-Measure = 0.705, ROC = 0.758, precision = 0.719, and sensitivity = 0.712). Finally, the MLP technique performed the worst (F-Measure = 0.702, ROC = 0.76, precision = 0.701, and sensitivity = 0.703).

Conclusion: According to the results, we concluded that J-48 could provide better insights than the other prediction models for clinical applications.
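The Chi-square feature selection step named in the Materials and Methods can be illustrated with a minimal sketch. It assumes a single binary risk factor cross-tabulated against CRC status and a conventional 0.05 significance threshold; neither assumption is stated in the abstract. The exposure counts are hypothetical and chosen only so that the column totals match the reported class sizes (194 CRC, 274 non-CRC). The function name `chi_square_2x2` is likewise illustrative and is not the authors' implementation.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    Assumed layout (rows = risk factor present/absent, columns = CRC/non-CRC):
        a  b
        c  d
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one subject")
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of risk factor and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

# HYPOTHETICAL exposure counts; only the class sizes (194 and 274) come from the abstract.
exposed_crc, exposed_non_crc = 120, 100
unexposed_crc, unexposed_non_crc = 194 - exposed_crc, 274 - exposed_non_crc

stat = chi_square_2x2(exposed_crc, exposed_non_crc, unexposed_crc, unexposed_non_crc)
keep_feature = stat > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {stat:.2f}, keep feature: {keep_feature}")
```

In a full pipeline, a statistic like this would be computed for each candidate risk factor, and only factors whose statistic exceeds the chosen threshold would be passed on to the classifiers.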

Similar resources

Predicting Type2 Diabetes Using Data Mining Algorithms

Background and purpose: Today, information systems and databases are widely used, and to achieve higher accuracy and speed in diagnosing and preventing diseases and in choosing treatments, they should be merged with traditional methods. This study aimed at presenting an accurate system for diagnosis of diabetes using data mining and a heuristic method combining neural network and pa...

Comparison of the Efficiency of Data Mining Algorithms in Predicting the Diagnosis of Diabetes

Background: Diabetes is one of the major health problems in Iran, and about 4.6 million adults suffer from this disease. Poor diagnosis has left half of them unaware of their disease. In recent years, along with the use of computers in data analysis and storage, the volume and complexity of data have increased dramatically. Methods: In health organizations, data pl...

Predicting the Credit Risk of Loans Using Data Mining Tools

One of the most common phenomena taken into account in credit risk is the customer’s noncompliance with commitments. Thus, by predicting the behavior of loan applicants, the growth rate of debts can be decreased. Hence, this study is conducted on corporate applicants for loans in one of the public banks in Iran. In this paper, the main elements comprising the cus...

Comparing Three Data Mining Algorithms for Identifying the Associated Risk Factors of Type 2 Diabetes

Background: The increasing prevalence of type 2 diabetes has given rise to a global health burden and a concern among health service providers and health administrators. The current study aimed at developing and comparing some statistical models to identify the risk factors associated with type 2 diabetes. In this light, artificial neural network (ANN), support vector machines (SVMs), and multi...

Evaluation of Data Mining Algorithms for Detection of Liver Disease

Background and Aim: The liver, as one of the largest internal organs in the body, is responsible for many vital functions, including purifying blood, regulating the body's hormones, and preserving glucose in the body. Therefore, disruptions in these functions will sometimes be irreparable. Early prediction of liver diseases will help their early and effective treatm...

Using data mining techniques for predicting the survival rate of breast cancer patients: a review article

This review was conducted between December 2018 and March 2019 at Isfahan University of Medical Sciences. A review of various studies revealed which data mining techniques have been used to predict the probability of survival, which risk factors have been used for these predictions, which criteria have been used for evaluating the data mining techniques, and finally which data sources have been used to predict the surv...

Journal title

volume 29 issue 133

pages 100-108

publication date 2021-02
